Datasets and models associated with the paper "Large-Scale Data Selection for Instruction Tuning" (https://arxiv.org/abs/2503.01807)
Hamish Ivison
hamishivi
AI & ML interests
NLP :)
Recent Activity
Updated a dataset about 23 hours ago: hamishivi/combined_o3_val_data_1sample
Updated a dataset about 23 hours ago: hamishivi/o3_generations_big_rl
Published a dataset 1 day ago: hamishivi/o3_generations_big_rl
Models: 34

hamishivi/s1k_seq_orig_hyper__42__1740446762 • 2
hamishivi/tulu_3_long_finetune_qwen_7b_reg_system_prompt • 4
hamishivi/tulu-2-wildchat-326k-sft
hamishivi/tulu-2-arena-hard-326k-sft • 3
hamishivi/llama-3.1-tulu-3-arena-hard-939k-sft • 3
hamishivi/llama-3.1-tulu-3-multitask-rrmax-939k-sft • 4
hamishivi/tulu-2-multitask-rrmax-326k-sft • 3
hamishivi/qwen2_math_tokenizer_tweaked
hamishivi/0224_jupiter_hamish_grpo_tulu3_s1k_orz_30350
hamishivi/0224_jupiter_hamish_grpo_s1k_only_orz_24021 • 1
Datasets: 88

hamishivi/o3_generations_big_rl • 258k • 15
hamishivi/combined_o3_val_data_1 • 9.25k • 14
hamishivi/o3-test • 99 • 33
hamishivi/combined_o3_val_data_1sample • 2.46M • 98
hamishivi/combined_o3_val_data • 12.3M • 45
hamishivi/WebInstruct-verified • 233k • 47
hamishivi/tulu_3_rewritten_400k • 395k • 50
hamishivi/tulu_3_rewritten_100k • 80.3k • 266
hamishivi/logic_lm • 1.2k • 128
hamishivi/logic_701 • 701 • 127